Mining developer communication data streams

Authors

  • Andy M. Connor
  • Jacqui Finlay
  • Russel Pears
Abstract

This paper explores the concept of modelling a software development project as a process that produces a continuous stream of data. In the Jazz repository used in this research, one component of that stream is developer communication. Such data can be used to build an evolving social network characterized by a range of metrics. The paper applies data stream mining techniques to identify the metrics most useful for predicting build outcomes. Results are presented from applying the Hoeffding Tree classification method in conjunction with the Adaptive Sliding Window (ADWIN) method for detecting concept drift. The results indicate that only a small number of the metrics considered have any significance for predicting the outcome of a build.
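As background for the Hoeffding Tree method named in the abstract, the following is a minimal sketch of the standard Hoeffding bound and the split test built on it. It is not the paper's implementation: the function names `hoeffding_bound` and `should_split`, and all numeric values in the usage lines, are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the true
    mean of a variable with the given range lies within epsilon of the mean
    observed over n independent samples."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: split a leaf once the gap between the two best
    candidate attributes exceeds the Hoeffding bound for the samples seen."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative values only (not taken from the paper).
epsilon = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
clear_winner = should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000)
close_call = should_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=1000)
```

Because the bound shrinks as more examples arrive, a leaf that cannot yet separate its two best attributes simply waits for more data. This is what lets the tree be trained incrementally on a stream rather than on a stored data set.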

Similar resources

Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows

Today, in many modern applications, we search for frequent and repeating patterns in the analyzed data sets. In this search, we look for patterns that appear frequently in the data set and mark them as frequent patterns so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...

Collective Sequential Pattern Mining in Distributed Evolving Data Streams

The advances in processing and communication techniques resulted in a multitude of emerging applications that interact with streams of data. Traditional data mining systems store arriving data, collect them for later mining, and make multiple passes over the collected data. Unfortunately, these systems are prohibitively slow when they deal with data streams with massive amounts of data arriving...

Mining Data Bases and Data Streams

Data mining represents an emerging technology area of great importance to homeland security. Data mining enables knowledge discovery on databases by identifying patterns that are novel, useful, and actionable. It has proven successful in many domains, such as banking, ecommerce, genomic, investment, telecom, web analysis, link analysis, and security applications. In this chapter, we will survey...

A Scalable Distributed Stream Mining System for Highway Traffic Data

To achieve the concept of smart roads, intelligent sensors are being placed on the roadways to collect real-time traffic streams. Traditional methods do not respond in real time and incur high communication and storage costs. Existing distributed stream mining algorithms do not consider the resource limitations of lightweight devices such as sensors. In this paper, we propose a distributed ...

Component-based Framework for Mobile Data Mining with Support for Real-Time Sensors

The increasing use of various mobile devices has shown that there is a need for mobile data mining applications. While many existing data mining frameworks can be modified to handle data streams generated in real time, they are usually too complex and inflexible to be used in mobile devices. This paper presents Mobile Smart Archive, a component-based framework for data stream mining in mobile d...

Journal title:
  • CoRR

Volume abs/1407.6104  Issue

Pages  -

Publication date 2014